HAUSDORFF MOMENT PROBLEM VIA FRACTIONAL MOMENTS 



1. Introduction 

In the applied sciences, a variety of problems formulated in terms of linear boundary value problems or integral equations lead to a Hausdorff moment problem. Such a problem arises when a given sequence of real numbers may be represented as the moments around the origin of a non-negative measure defined on a finite interval, typically [0, 1]. The underlying density $f(x)$ is unknown, while its moments $\mu_j = \int_0^1 x^j f(x)\,dx$, $j = 0, 1, 2, \ldots$, with $\mu_0 = 1$, are known. In practice, $f(x)$ is then recovered, through a variety of techniques, by taking into account only a finite sequence $\{\mu_j\}_{j=0}^{M}$. Such a process implies that $f(x)$ is well characterized by its first few moments. On the other hand, it is well known that the moment problem becomes ill-conditioned when the number of moments involved in the reconstruction increases [1,2]. In the Hausdorff case, once $(\mu_0, \ldots, \mu_{m-1})$ are fixed, the moment $\mu_m$ may assume values within the interval $[\mu_m^-, \mu_m^+]$, where [3]

$$\mu_m^+ - \mu_m^- \le 2^{-2(m-1)}. \tag{1.1}$$
If one considers the approximating density $f_M(x) = \exp\big(-\sum_{j=0}^{M} \lambda_j x^j\big)$ obtained by entropy maximization, constrained by the first $M$ moments [4], then its entropy $H[f_M] = -\int_0^1 f_M(x) \ln f_M(x)\,dx$ satisfies

$$\lim_{\mu_M \to \mu_M^{\pm}} H[f_M] = -\infty. \tag{1.2}$$



Such a relationship is satisfied by any other distribution constrained by the same first $M$ moments, since $f_M(x)$ has maximum entropy. On the other hand, $f(x)$ and $f_M(x)$ have the same first $M$ moments and, as a consequence, as we illustrate in Section 3, the following relationship holds:

$$I(f, f_M) =: \int_0^1 f(x) \ln\frac{f(x)}{f_M(x)}\,dx = H[f_M] - H[f]. \tag{1.3}$$

Here $H[f]$ is the entropy of $f(x)$, while $I(f, f_M)$ is the Kullback-Leibler distance between $f(x)$ and $f_M(x)$.

Equations (1.1)-(1.3) underline once more the ill-conditioned nature of the moment problem. The ill-conditioning is further highlighted by considering the estimation of the parameters $\lambda_j$ of $f_M(x)$. The calculation of the $\lambda_j$ leads to the minimization of a proper potential function $\Gamma(\lambda_1, \ldots, \lambda_M)$ [4], with

$$\min_{\lambda_1,\ldots,\lambda_M} \Gamma(\lambda_1,\ldots,\lambda_M) = \min_{\lambda_1,\ldots,\lambda_M}\left[\ln\left(\int_0^1 \exp\Big(-\sum_{j=1}^{M}\lambda_j x^j\Big)dx\right) + \sum_{j=1}^{M}\lambda_j\mu_j\right]. \tag{1.4}$$

$f_M(x)$ satisfies the constraints

$$\mu_j = \int_0^1 x^j \exp\Big(-\sum_{k=0}^{M}\lambda_k x^k\Big)dx, \quad j = 0,\ldots,M. \tag{1.5}$$

Letting $\mu = (\mu_0,\ldots,\mu_M)$ and $\lambda = (\lambda_0,\ldots,\lambda_M)$, (1.5) may be written as the map

$$\mu = \phi(\lambda). \tag{1.6}$$

Then the corresponding Jacobian matrix, which is up to sign a Hankel matrix, has condition number $\simeq (1+\sqrt{2})^{4M}/\sqrt{M}$ [5]. All the previous remarks lead to the conclusion that $f(x)$ may be efficiently recovered from moments only if few moments are required. In other terms, $f(x)$ may be recovered from moments if its information content is spread among the first few moments.
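The growth of the condition number can be illustrated with a minimal sketch. It rests on assumptions that are not taken from the paper: the density is taken to be uniform (all $\lambda_j = 0$ for $j \ge 1$), so that the Hankel matrix with entries $\mu_{i+j} = 1/(i+j+1)$ is the Hilbert matrix; the condition number is measured in the infinity norm rather than whatever norm [5] uses; and the function names are illustrative. The inverse is computed exactly with the classical closed form for the Hilbert matrix.

```python
from fractions import Fraction
from math import comb, sqrt

def hilbert(n):
    """Hankel moment matrix of the uniform density: entry (i, j) is 1/(i+j+1)."""
    return [[Fraction(1, i + j + 1) for j in range(n)] for i in range(n)]

def hilbert_inverse(n):
    """Exact inverse of the n x n Hilbert matrix (classical closed form, 1-based i, j)."""
    return [[(-1) ** (i + j) * (i + j - 1)
             * comb(n + i - 1, n - j) * comb(n + j - 1, n - i)
             * comb(i + j - 2, i - 1) ** 2
             for j in range(1, n + 1)] for i in range(1, n + 1)]

def cond_inf(n):
    """Infinity-norm condition number ||H|| * ||H^-1||, computed exactly."""
    def norm(a):
        return max(sum(abs(v) for v in row) for row in a)
    return norm(hilbert(n)) * norm(hilbert_inverse(n))

if __name__ == "__main__":
    for m in range(1, 9):
        n = m + 1  # moments mu_0..mu_M give an (M+1) x (M+1) matrix
        # second column: exact condition number; third: the estimate quoted from [5]
        print(m, float(cond_inf(n)), (1 + sqrt(2)) ** (4 * m) / sqrt(m))
```

Each additional moment multiplies the condition number by a factor of roughly $(1+\sqrt{2})^4 \approx 34$, which is why only a handful of ordinary moments can be used reliably in double precision.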
In this paper we look for a way to overcome the above difficulties in recovering $f(x)$ from moments. First of all, we assume the infinite sequence of moments $\{\mu_j\}_{j=0}^{\infty}$ to be known. Then, from such a sequence, we calculate fractional moments

$$E(X^{\alpha_j}) =: \int_0^1 x^{\alpha_j} f(x)\,dx = \sum_{n=0}^{\infty} b_n(\alpha_j)\,\mu_n, \quad \alpha_j > 0, \tag{1.7}$$

where the explicit analytic expression of $b_n(\alpha_j)$ is given by (2.5). Finally, from a finite number of fractional moments $\{E(X^{\alpha_j})\}_{j=1}^{M}$, we recover $f_M(x) = \exp\big(-\sum_{j=0}^{M}\lambda_j x^{\alpha_j}\big)$ by entropy maximization [4]. The exponents $\{\alpha_j\}_{j=1}^{M}$ are chosen as follows:

$$\{\alpha_j\}_{j=1}^{M} : H[f_M] = \text{minimum}. \tag{1.8}$$

The choice of $\{\alpha_j\}_{j=1}^{M}$ according to (1.8) leads to a density $f_M(x)$ having minimum distance from $f(x)$, as stressed by (1.3).

Remark. If the information content of $f(x)$ is shared among the first moments, so that the ME (maximum entropy) approximant $f_M(x)$ represents an accurate approximation of $f(x)$, then fractional moments may be accurately calculated by replacing $f(x)$ with $f_M(x)$. Indeed, $f_M(x)$ converges in entropy, and hence in $L_1$-norm, to $f(x)$ [6], and the error obtained by replacing $f(x)$ with $f_M(x)$,

$$\left|E_f(X^{\alpha_j}) - E_{f_M}(X^{\alpha_j})\right| \le \int_0^1 x^{\alpha_j}\,|f(x) - f_M(x)|\,dx \le \int_0^1 |f(x) - f_M(x)|\,dx \le \sqrt{2\,(H[f_M] - H[f])}, \tag{1.9}$$

may be rendered arbitrarily small by increasing $M$ (the inequalities in (1.9) are proved in Section 3).

2. Fractional moments from moments 

Let $X$ be a continuous random variable with density $f(x)$ on the support [0, 1], with moments of order $s$ centered at $c$, $c \in \mathbb{R}$,

$$\mu_s(c) := E[(X - c)^s] = \int_0^1 (x - c)^s f(x)\,dx, \quad s \in \mathbb{N}^* = \mathbb{N}\cup\{0\}, \tag{2.1}$$

and moments from the origin $\mu_s =: \mu_s(0)$, related to the moments generically centered at $c$ through the relationship

$$\mu_s = \sum_{h=0}^{s}\binom{s}{h} c^{s-h}\,\mu_h(c), \quad s \in \mathbb{N}^*. \tag{2.2}$$

A relationship similar to (2.2) is well known; it permits the calculation of the (fractional) moment of order $s \in \mathbb{R}^+$ (where $s$ replaces the $\alpha_j$ of (1.7) and (3.2) for notational convenience) in terms of all the central moments of a given distribution about the point $c$.
Firstly, by definition of the noncentral moment of order $s$, we can write $E(X^s) = \int_0^1 x^s f(x)\,dx$; then, by Taylor expansion of $x^s$ around $c$, where $c \in (0, 1)$, we have

$$x^s = \sum_{n=0}^{\infty}\big[(x^s)^{(n)}\big]_{x=c}\,\frac{(x-c)^n}{n!} = \sum_{n=0}^{\infty}\binom{s}{n} c^{s-n}(x-c)^n, \tag{2.3}$$

where $[k^{(n)}(x)]_{x=c}$ indicates the $n$-th derivative of the function $k(x)$ with respect to $x$, evaluated at $c$, and $\binom{s}{n} = s(s-1)\cdots(s-n+1)/n!$ is the generalized binomial coefficient.
Taking the expectation on both sides of the last equation in (2.3), we get the required relationship

$$E(X^s) = \sum_{n=0}^{\infty}\binom{s}{n} c^{s-n}\,E[(X - c)^n] = \sum_{n=0}^{\infty} b_n(s)\,\mu_n(c), \tag{2.4}$$

where

$$b_n(s) = \binom{s}{n} c^{s-n} \tag{2.5}$$

represents the coefficient of the integer moment of order $n$ of $X$ centered at $c$.
The formulation of the fractional moments of order $s$ as in (2.4) shows some numerical instabilities, which depend on the structure of the relationship between $\mu_n(c)$ and $E(X^s)$; these instabilities are related to the value of the center $c$ and increase as the order of the central moments becomes high. In particular,

(a) the numerical error $\Delta E(X - c)^n$ due to the evaluation of $E(X - c)^n$ in terms of the noncentral integer moments $E(X^h)$, $h \le n$, becomes bigger as $c$ and $n$ increase. In fact,

$$\begin{aligned}
|\Delta E(X - c)^n| &= \left|\sum_{h=0}^{n}(-1)^h\binom{n}{h} c^{n-h}\,\Delta E(X^h)\right| \le \sum_{h=0}^{n}\binom{n}{h} c^{n-h}\,|\Delta E(X^h)| \\
&\le \|\Delta E(X^h)\|_\infty \sum_{h=0}^{n}\binom{n}{h} c^{n-h} = \|\Delta E(X^h)\|_\infty\,(1 + c)^n \simeq \text{eps}\,(1 + c)^n,
\end{aligned} \tag{2.6}$$

where eps denotes the machine precision,
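(b) the numerical error $\Delta E(X^s)$ due to the evaluation of $E(X^s)$ involving the first $M_{\max}$ central moments $E(X - c)^n$ is given by

$$\begin{aligned}
|\Delta E(X^s)| &= \left|\sum_{n=0}^{M_{\max}}\binom{s}{n} c^{s-n}\,\Delta E(X - c)^n\right| \le \sum_{n=0}^{M_{\max}}\left|\binom{s}{n}\right| c^{s-n}\,|\Delta E(X - c)^n| \\
&\le \|\Delta E(X - c)^n\|_\infty\; c^s\,\max_n\binom{s}{n}\sum_{n=0}^{M_{\max}}\left(\frac{1}{c}\right)^n = \|\Delta E(X - c)^n\|_\infty\; c^s\,\max_n\binom{s}{n}\,\frac{(1/c)^{M_{\max}+1} - 1}{(1/c) - 1},
\end{aligned} \tag{2.7}$$

with $\max_n\binom{s}{n} = \binom{s}{[s/2]}$ if $[s]$ is even and $\max_n\binom{s}{n} = \binom{s}{[s/2]+1}$ if $[s]$ is odd, where $[x]$ represents the integer part of $x$. The product of the first two factors on the right-hand side of (2.7) is an increasing function of $c$, whilst the last factor is a decreasing function of $c$.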
Hence, taking into account both (a) and (b), a reasonable choice of $c$ could be $c = 1/2$. Further, rewriting the last inequality in (2.7) as

$$|\Delta E(X^s)| \le \|\Delta E(X - c)^n\|_\infty\; c^s\,\max_n\binom{s}{n}\,\frac{(1/c)^{M_{\max}+1} - 1}{(1/c) - 1} \le \varepsilon,$$

we can reconstruct the fractional moment of order $s$ with a prefixed level of accuracy $\varepsilon$, $\varepsilon > 0$, by involving a number of central moments equal to $M_{\max}$.
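The series (2.4)-(2.5), truncated at $M_{\max}$ central moments about $c = 1/2$, can be sketched as follows. This is only an illustration under stated assumptions: the test density is the uniform one (not an example from the paper), chosen because $\mu_n = 1/(n+1)$ and $E(X^s) = 1/(s+1)$ are known in closed form; the central moments are formed in exact rational arithmetic to sidestep the rounding amplification described in (a); and all function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def central_moments(mu, c):
    """Exact E[(X - c)^n], n = 0..len(mu)-1, from ordinary moments (binomial theorem)."""
    return [sum(comb(n, h) * (-c) ** (n - h) * mu[h] for h in range(n + 1))
            for n in range(len(mu))]

def gen_binom(s, n):
    """Generalized binomial coefficient s(s-1)...(s-n+1)/n! for real s."""
    out = 1.0
    for k in range(n):
        out *= (s - k) / (k + 1)
    return out

def fractional_moment(s, mu, c=Fraction(1, 2)):
    """Truncated series (2.4): sum over n of binom(s, n) c^(s-n) E[(X - c)^n]."""
    mc = central_moments(mu, c)
    cf = float(c)
    return sum(gen_binom(s, n) * cf ** (s - n) * float(mc[n]) for n in range(len(mc)))

if __name__ == "__main__":
    # Assumed test case: uniform density on [0, 1], mu_n = 1/(n+1), E(X^s) = 1/(s+1).
    mu = [Fraction(1, n + 1) for n in range(61)]
    for m_max in (4, 12, 30, 60):
        approx = fractional_moment(0.5, mu[: m_max + 1])
        print(m_max, approx, abs(approx - 2.0 / 3.0))
```

For this density the truncation error decays only polynomially in $M_{\max}$, so a moderate number of central moments is needed; in floating-point arithmetic the bound (2.7) limits how far $M_{\max}$ can usefully be pushed.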

3. Recovering f(x) from fractional moments 

Let $X$ be a positive r.v. on [0, 1] with density $f(x)$, Shannon entropy $H[f] = -\int_0^1 f(x)\ln f(x)\,dx$ and moments $\{\mu_j\}_{j=0}^{\infty}$, from which positive fractional moments $E(X^{\alpha_j}) = \sum_{n=0}^{\infty} b_n(\alpha_j)\,\mu_n$ may be obtained, as in (2.4)-(2.5).

From [4], we know that the Shannon-entropy maximizing density function $f_M(x)$, which has the same $M$ fractional moments $E(X^{\alpha_j})$ as $f(x)$, $j = 0, \ldots, M$, is

$$f_M(x) = \exp\Big(-\sum_{j=0}^{M}\lambda_j x^{\alpha_j}\Big). \tag{3.1}$$

Here $(\lambda_0, \ldots, \lambda_M)$ are Lagrange multipliers, which are determined by the condition that the first $M$ fractional moments of $f_M(x)$ coincide with $E(X^{\alpha_j})$, i.e.,

$$E(X^{\alpha_j}) = \int_0^1 x^{\alpha_j} f_M(x)\,dx, \quad j = 0, \ldots, M, \quad \alpha_0 = 0. \tag{3.2}$$

The Shannon entropy $H[f_M]$ of $f_M(x)$ is given by

$$H[f_M] = -\int_0^1 f_M(x)\ln f_M(x)\,dx = \sum_{j=0}^{M}\lambda_j E(X^{\alpha_j}). \tag{3.3}$$

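Given two probability densities $f(x)$ and $f_M(x)$, there are two well-known measures of the distance between $f(x)$ and $f_M(x)$: the divergence measure $I(f, f_M) = \int_0^1 f(x)\ln\frac{f(x)}{f_M(x)}\,dx$ and the variation measure $V(f, f_M) = \int_0^1 |f_M(x) - f(x)|\,dx$. If $f(x)$ and $f_M(x)$ have the same fractional moments $E(X^{\alpha_j})$, $j = 1, \ldots, M$, then

$$I(f, f_M) = H[f_M] - H[f] \tag{3.4}$$

holds. In fact, since $-\ln f_M(x) = \sum_{j=0}^{M}\lambda_j x^{\alpha_j}$, we have $I(f, f_M) = \int_0^1 f(x)\ln\frac{f(x)}{f_M(x)}\,dx = -H[f] + \sum_{j=0}^{M}\lambda_j\int_0^1 x^{\alpha_j} f(x)\,dx = -H[f] + \sum_{j=0}^{M}\lambda_j E(X^{\alpha_j}) = H[f_M] - H[f]$.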

In the literature, several lower bounds for the divergence measure $I$ based on the variation measure $V$ are available. We shall use the following bound [7]:

$$I \ge \frac{V^2}{2}. \tag{3.5}$$

If $g(x)$ denotes a bounded function, such that $|g(x)| \le K$, $K > 0$, by taking into account (3.4) and (3.5) we have

$$\left|E_f(g) - E_{f_M}(g)\right| \le \int_0^1 |g(x)|\,|f(x) - f_M(x)|\,dx \le K\sqrt{2\,(H[f_M] - H[f])}. \tag{3.6}$$


Equation (3.6) suggests which fractional moments should be chosen:

$$\{\alpha_j\}_{j=1}^{M} : H[f_M] = \text{minimum}. \tag{3.7}$$

The use of fractional moments in the framework of ME relies on the following two theoretical results. The first is a theorem [8, Th. 2] which guarantees the existence of a probability density from the knowledge of an infinite sequence of fractional moments.

Theorem 3.1 [8, Th. 2] If $X$ is a r.v. assuming values in a bounded interval [0, 1] and $\{\alpha_j\}_{j=0}^{\infty}$ is an infinite sequence of positive and distinct numbers satisfying $\lim_{j\to\infty}\alpha_j = 0$ and $\sum_{j=0}^{\infty}\alpha_j = +\infty$, then the sequence of moments $\{E(X^{\alpha_j})\}_{j=0}^{\infty}$ characterizes $X$.



The second concerns the convergence in entropy of $f_M(x)$, where entropy convergence means $\lim_{M\to\infty} H[f_M] = H[f]$. More precisely,

Theorem 3.2. If $\{\alpha_j\}_{j=0}^{M}$ are equispaced within [0, 1), with $\alpha_{M-j+1} = \frac{j}{M+1}$, $j = 0, \ldots, M$, then the ME approximant converges in entropy to $f(x)$.
Proof. See Appendix.

We just point out that the choice of equispaced points $\alpha_{M-j+1} = \frac{j}{M+1}$, $j = 0, \ldots, M$, satisfies both conditions of Theorem 3.1, i.e.

$$\lim_{M\to\infty}\alpha_M = 0 \quad\text{and}\quad \lim_{M\to\infty}\sum_{j=0}^{M}\alpha_j = \lim_{M\to\infty}\frac{1}{M+1}\cdot\frac{M}{2}\,(M+1) = +\infty.$$

As a consequence, if the choice of equispaced $\alpha_{M-j+1}$ guarantees entropy convergence, then the choice (3.7) guarantees entropy convergence too.

From a computational point of view, the Lagrange multipliers $(\lambda_1, \ldots, \lambda_M)$ are obtained from (1.4), and the normalizing constant $\lambda_0$ is obtained by imposing that the density integrates to 1. Then the optimal exponents $\{\alpha_j\}_{j=1}^{M}$ are obtained as

$$\{\alpha_j\}_{j=1}^{M} : \min_{\alpha_1,\ldots,\alpha_M}\Big\{\min_{\lambda_1,\ldots,\lambda_M}\Gamma(\lambda_1,\ldots,\lambda_M)\Big\}. \tag{3.8}$$
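The inner minimization over the multipliers can be sketched as a damped Newton iteration on the potential $\Gamma$, written for generic exponents $\alpha_j$. This is a minimal illustration under several assumptions that are not taken from the paper: integrals are replaced by a midpoint rule on a fixed grid, the solver and its tolerances are ad hoc, and every function name is hypothetical. The outer search over the exponents is not shown. At the minimum, $\Gamma$ equals the entropy $H[f_M]$ of (3.3), with $\lambda_0 = \ln\int_0^1\exp(-\sum_{j\ge1}\lambda_j x^{\alpha_j})\,dx$.

```python
import math

N = 2000
XS = [(i + 0.5) / N for i in range(N)]  # midpoint quadrature nodes on [0, 1]

def solve(a, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    a = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        for r in range(col + 1, n):
            fac = a[r][col] / a[col][col]
            for k in range(col, n + 1):
                a[r][k] -= fac * a[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (a[r][n] - sum(a[r][k] * x[k] for k in range(r + 1, n))) / a[r][r]
    return x

def potential(lam, alphas, moments):
    """Gamma(lambda) = ln int exp(-sum lam_j x^alpha_j) dx + sum lam_j m_j."""
    z = sum(math.exp(-sum(l * x ** a for l, a in zip(lam, alphas))) for x in XS) / N
    return math.log(z) + sum(l * m for l, m in zip(lam, moments))

def maxent_fit(alphas, moments, iters=50):
    """Return (lambda_1..lambda_M, Gamma at the minimum) by damped Newton steps."""
    m = len(alphas)
    lam = [0.0] * m
    feats = [[x ** a for x in XS] for a in alphas]
    for _ in range(iters):
        w = [math.exp(-sum(l * fj[i] for l, fj in zip(lam, feats))) for i in range(N)]
        z = sum(w)
        mean = [sum(wi * fi for wi, fi in zip(w, fj)) / z for fj in feats]
        grad = [moments[j] - mean[j] for j in range(m)]       # gradient of Gamma
        if max(abs(g) for g in grad) < 1e-12:
            break
        hess = [[sum(wi * a * b for wi, a, b in zip(w, feats[j], feats[k])) / z
                 - mean[j] * mean[k] for k in range(m)] for j in range(m)]  # covariance
        step = solve(hess, [-g for g in grad])
        t, g0 = 1.0, potential(lam, alphas, moments)
        # Backtrack until Gamma does not increase (small slack absorbs rounding noise).
        while t > 1e-8 and potential([l + t * s for l, s in zip(lam, step)],
                                     alphas, moments) > g0 + 1e-12:
            t *= 0.5
        lam = [l + t * s for l, s in zip(lam, step)]
    return lam, potential(lam, alphas, moments)

if __name__ == "__main__":
    # Assumed test case: f(x) = (pi/2) sin(pi x) with ordinary moments mu_1, mu_2.
    lam, h = maxent_fit([1.0, 2.0], [0.5, 0.5 - 2.0 / math.pi ** 2])
    print("lambda =", lam, " H[f_M] =", h)
```

Because $\Gamma$ is convex in the multipliers, the Newton iteration is well behaved for a small number of constraints; the difficulty discussed in the Introduction appears when many ordinary moments are imposed and the Hessian becomes ill-conditioned.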



4. Numerical results 

We compare fractional and ordinary moments by choosing some probability densities on [0, 1].
Example 1. Let

$$f(x) = \frac{\pi}{2}\sin(\pi x)$$

with $H[f] \simeq -0.144729886$. From $f(x)$ we have ordinary moments satisfying the recursive relationship

$$\mu_n = \frac{1}{2} - \frac{n(n-1)}{\pi^2}\,\mu_{n-2}, \quad n = 2, 3, \ldots, \quad \mu_0 = 1, \quad \mu_1 = \frac{1}{2}.$$
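The entropy value and the recursion for Example 1 can be cross-checked with a few lines of code. The sketch below is only a sanity check, with a midpoint rule standing in for exact integration and with illustrative function names; note that $H[f] = 1 - \ln\pi$ in closed form for this density.

```python
import math

def moments_by_recursion(n_max):
    """mu_0..mu_{n_max} from mu_n = 1/2 - n(n-1)/pi^2 * mu_{n-2}, mu_0 = 1, mu_1 = 1/2."""
    mu = [1.0, 0.5]
    for n in range(2, n_max + 1):
        mu.append(0.5 - n * (n - 1) / math.pi ** 2 * mu[n - 2])
    return mu[: n_max + 1]

def midpoint(g, n=20000):
    """Midpoint-rule approximation of the integral of g over [0, 1]."""
    h = 1.0 / n
    return h * sum(g((i + 0.5) * h) for i in range(n))

def f(x):
    return 0.5 * math.pi * math.sin(math.pi * x)

entropy = -midpoint(lambda x: f(x) * math.log(f(x)))
mu_rec = moments_by_recursion(6)
mu_quad = [midpoint(lambda x, n=n: x ** n * f(x)) for n in range(7)]

if __name__ == "__main__":
    print("H[f] by quadrature:", entropy, " 1 - ln(pi):", 1.0 - math.log(math.pi))
    for n, (a, b) in enumerate(zip(mu_rec, mu_quad)):
        print(n, a, b)
```

The forward recursion multiplies the error in $\mu_{n-2}$ by $n(n-1)/\pi^2$, so in floating-point arithmetic it is best used only for moderate $n$.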

From $\{\mu_n\}_{n=0}^{\infty}$ we calculate $E(X^{\alpha_j}) = \sum_{n=0}^{\infty} b_n(\alpha_j)\,\mu_n$, as in (2.4)-(2.5). From $\{E(X^{\alpha_j})\}_{j=0}^{M}$ we obtain the ME approximant $f_M(x)$ for increasing values of $M$, where $\{\alpha_j\}_{j=1}^{M}$ satisfy (3.7).
Table 1 reports:

a) $H[f_M] - H[f] = I(f, f_M)$ and the exponents $\{\alpha_j\}_{j=1}^{M}$ satisfying (3.7), where $H[f_M]$ is obtained using fractional moments.

b) $H[f_M] - H[f] = I(f, f_M)$, where $H[f_M]$ is obtained using ordinary moments.

Inspection of Table 1 allows us to conclude that:

1) The entropy decrease is fast, so that in practice 4-5 fractional moments determine $f(x)$.

2) Conversely, a high number of ordinary moments is required for a satisfactory characterization of $f(x)$.

3) Approximately 12 ordinary moments have an effect comparable to 3 fractional moments. $f(x)$ and $f_M(x)$, obtained from 4-5 fractional moments, are practically indistinguishable.

Table 1

Optimal fractional moments and entropy difference of distributions having an increasing number of common a) fractional moments, b) ordinary moments

a) Fractional moments

 M   exponents {alpha_j}, j = 1, ..., M            H[f_M] - H[f]
 1   10.4101                                       0.8716E-1
 2   …                                             0.2938E-2
 3   0.04680, 1.84212, 13.2143                     0.3038E-3
 4   0.00220, 2.76784, 13.7293, 20.5183            0.3276E-4
 5   0.0024, 2.7000, 13.700, 20.500, 25.200        0.1016E-4

b) Ordinary moments

 M   H[f_M] - H[f]
 2   …
 4   …
 6   0.7058E-3
 8   0.4442E-3
10   0.3357E-3
12   0.3288E-3


Example 2. This example is borrowed from [9]. There the authors attempt to recover a non-negative, decreasing, differentiable function $f(x)$ from the frequency moments $\omega_n$, with

$$\omega_n = \int_0^1 [f(x)]^n\,dx, \quad n = 1, 2, \ldots$$

The authors of [9] recognize that other density reconstruction procedures, alternative to ordinary moments, would be desirable. We propose a density reconstruction procedure based on fractional moments. Here

$$f(x) = 2\left[\frac{1}{2} + \frac{1}{10}\ln\left(\frac{1}{Ax + B} - 1\right)\right], \quad B = \frac{1}{1 + e^{5}}, \quad A = \frac{1}{1 + e^{-5}} - \frac{1}{1 + e^{5}},$$

with $H[f] \simeq -0.06118227$ ($f(x)$, compared to [9], contains the normalizing constant 2). From $f(x)$ we obtain the ordinary moments $\mu_n$ through a numerical procedure. From $\{\mu_n\}_{n=0}^{\infty}$ we calculate $E(X^{\alpha_j}) = \sum_{n=0}^{\infty} b_n(\alpha_j)\,\mu_n$, as in (2.4)-(2.5). Finally, from $\{E(X^{\alpha_j})\}_{j=0}^{M}$ we obtain the ME approximant $f_M(x)$ for increasing values of $M$, where $\{\alpha_j\}_{j=1}^{M}$ satisfy (3.7).
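As a sanity check on the density of Example 2, the following sketch evaluates its mass, endpoint values, entropy and first ordinary moments by quadrature. It assumes that the density reads $f(x) = 2\big[\tfrac12 + \tfrac1{10}\ln\big(\tfrac{1}{Ax+B} - 1\big)\big]$ with $B = 1/(1+e^{5})$ and $A = 1/(1+e^{-5}) - 1/(1+e^{5})$; the midpoint rule and the function names are illustrative choices and are not the numerical procedure used by the authors.

```python
import math

# Assumed constants of the Example 2 density.
B = 1.0 / (1.0 + math.exp(5.0))
A = 1.0 / (1.0 + math.exp(-5.0)) - B

def f(x):
    """Assumed form of the Example 2 density on [0, 1]."""
    return 2.0 * (0.5 + 0.1 * math.log(1.0 / (A * x + B) - 1.0))

def midpoint(g, n=20000):
    """Midpoint-rule approximation of the integral of g over [0, 1]."""
    h = 1.0 / n
    return h * sum(g((i + 0.5) * h) for i in range(n))

mass = midpoint(f)                                         # should be close to 1
entropy = -midpoint(lambda x: f(x) * math.log(f(x)))       # compare with H[f] above
moments = [midpoint(lambda x, n=n: x ** n * f(x)) for n in range(5)]

if __name__ == "__main__":
    print("f(0) =", f(0.0), " f(1) =", f(1.0))
    print("mass =", mass, " entropy =", entropy)
    print("mu_0..mu_4 =", moments)
```

With these constants $f$ decreases from $f(0) = 2$ to $f(1) = 0$ and satisfies $f(x) + f(1-x) = 2$, which is why the factor 2 normalizes it.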
Table 2 reports:

a) $H[f_M] - H[f] = I(f, f_M)$ and the exponents $\{\alpha_j\}_{j=1}^{M}$ satisfying (3.7), where $H[f_M]$ is obtained using fractional moments.

b) $H[f_M] - H[f] = I(f, f_M)$, where $H[f_M]$ is obtained using ordinary moments.

Inspection of Table 2 allows us to conclude that:

1) The entropy decrease is fast, so that in practice 4 fractional moments determine $f(x)$.

2) A high number of ordinary moments is required for a satisfactory characterization of $f(x)$.

3) Approximately 14 ordinary moments have an effect comparable to 4 fractional moments. The functions $f(x)$ and $f_M(x)$, obtained from 4 fractional moments, are practically indistinguishable.

As a consequence, we argue that the use of 4 fractional moments is as effective as that of 8 frequency moments (as in [9]). The former, indeed, provide an approximant $f_M(x)$ practically indistinguishable from $f(x)$ (see Figure 1 of [9]).

Table 2

Optimal fractional moments and entropy difference of distributions having an increasing number of common a) fractional moments, b) ordinary moments

a) Fractional moments

 M   exponents {alpha_j}, j = 1, ..., M            H[f_M] - H[f]
 1   1.56280                                       0.6278E-2
 2   0.52500, 3.90000                              0.3152E-2
 3   1.05000, 3.00000, 7.87500                     0.1169E-2
 4   0.44062, 7.65470, 12.5262, 63.9093            0.1025E-3

b) Ordinary moments

 M   H[f_M] - H[f]
 2   0.5718E-2
 4   0.1776E-2
 6   0.1320E-2
 8   0.6744E-3
10   0.3509E-3
12   0.2648E-3
14   0.1914E-3




5. Conclusions 

In this paper we have addressed the Hausdorff moment problem and solved it using a small number of fractional moments, calculated explicitly in terms of the given ordinary moments. The approximating density, constrained by a few fractional moments, has been obtained by the maximum entropy method. The fractional moments have been chosen by minimizing the entropy of the approximating density. The strategy proposed in the present paper for recovering a given density function consists in accelerating the convergence by a proper choice of fractional moments, thus obtaining an approximating density from low-order moments, as (1.1) suggests.
Appendix: Entropy convergence

A.1 Some background

Let us consider a sequence of equispaced points $\alpha_j = \frac{j}{M+1}$, $j = 0, \ldots, M$, and

$$\mu_j =: E(X^{\alpha_j}) = \int_0^1 t^{\alpha_j} f_M(t)\,dt, \quad j = 0, \ldots, M, \tag{A.1}$$

with $f_M(t) = \exp\big(-\sum_{j=0}^{M}\lambda_j t^{\alpha_j}\big)$. With the simple change of variable $x = t^{1/(M+1)}$, from (A.1) we have

$$\mu_j = E(X^{\alpha_j}) = \int_0^1 x^j \exp\Big[-(\lambda_0 - \ln(M+1)) - \sum_{k=1}^{M}\lambda_k x^k + M\ln x\Big]dx, \quad j = 0, \ldots, M, \tag{A.2}$$

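which is a reduced Hausdorff moment problem for each fixed value of $M$ and a determinate Hausdorff moment problem when $M \to \infty$. Referring to (A.2), the following symmetric positive definite Hankel matrices are considered:

$$\Delta_0 = \mu_0, \quad \Delta_2 = \begin{bmatrix}\mu_0 & \mu_1\\ \mu_1 & \mu_2\end{bmatrix}, \quad \ldots, \quad \Delta_{2M} = \begin{bmatrix}\mu_0 & \cdots & \mu_M\\ \vdots & & \vdots\\ \mu_M & \cdots & \mu_{2M}\end{bmatrix}, \tag{A.3}$$

whose $(i, j)$-th entry, $i, j = 0, 1, \ldots$, is

$$\mu_{i+j} = \int_0^1 x^{i+j} f_M(x)\,dx,$$

where $f_M(x) = \exp\big[-(\lambda_0 - \ln(M+1)) - \sum_{j=1}^{M}\lambda_j x^j + M\ln x\big]$. The Hausdorff moment problem is determinate and the underlying distribution has a continuous distribution function $F(x)$, with density $f(x)$. Then the maximal mass $\rho(x)$ which can be concentrated at any real point $x$ is equal to zero ([10], Corollary (2.8)). In particular, at $x = 0$ we have

$$0 = \rho(0) = \lim_{i\to\infty}\rho_i^{(0)} = \lim_{i\to\infty}\frac{|\Delta_{2i}|}{\begin{vmatrix}\mu_2 & \cdots & \mu_{i+1}\\ \vdots & & \vdots\\ \mu_{i+1} & \cdots & \mu_{2i}\end{vmatrix}} = \lim_{i\to\infty}\big(\mu_0 - \mu_0^{(i)}\big), \tag{A.4}$$

where $\rho_i^{(0)}$ indicates the largest mass which can be concentrated at the given point $x = 0$ by any solution of a reduced moment problem of order $\ge i$, and $\mu_0^{(i)}$ indicates the minimum value of $\mu_0$ once the first $2i$ moments are assigned.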

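Let us fix $\{\mu_0, \ldots, \mu_{i-1}, \mu_{i+1}, \ldots, \mu_M\}$ while only $\mu_i$, $i = 0, \ldots, M$, varies continuously. From (A.2) we have

$$\Delta_{2M}\begin{bmatrix}d\lambda_0/d\mu_i\\ \vdots\\ d\lambda_M/d\mu_i\end{bmatrix} = -e_{i+1}, \tag{A.5}$$

where $e_{i+1}$ is the canonical unit vector in $\mathbb{R}^{M+1}$, from which

$$0 < \left[\frac{d\lambda_0}{d\mu_i}, \ldots, \frac{d\lambda_M}{d\mu_i}\right]\Delta_{2M}\begin{bmatrix}d\lambda_0/d\mu_i\\ \vdots\\ d\lambda_M/d\mu_i\end{bmatrix} = -\left[\frac{d\lambda_0}{d\mu_i}, \ldots, \frac{d\lambda_M}{d\mu_i}\right]e_{i+1} = -\frac{d\lambda_i}{d\mu_i}. \tag{A.6}$$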
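A.2 Entropy convergence
The following theorem holds.

Theorem A.1 If $\alpha_j = \frac{j}{M+1}$, $j = 0, \ldots, M$, and $f_M(x) = \exp\big(-\sum_{j=0}^{M}\lambda_j x^{\alpha_j}\big)$, then

$$\lim_{M\to\infty} H[f_M] = H[f], \quad\text{with}\quad H[f_M] =: -\int_0^1 f_M(x)\ln f_M(x)\,dx, \quad H[f] =: -\int_0^1 f(x)\ln f(x)\,dx. \tag{A.7}$$

Proof. From (A.1) and (A.7) we have

$$H[f_M] = \sum_{j=0}^{M}\lambda_j\,\mu_j. \tag{A.8}$$

Let us consider (A.8). When only $\mu_0$ varies continuously, taking into account (A.3)-(A.6) and (A.8) we have

$$\frac{d}{d\mu_0}H[f_M] = \sum_{j=0}^{M}\mu_j\frac{d\lambda_j}{d\mu_0} + \lambda_0 = \lambda_0 - 1,$$

$$\frac{d^2}{d\mu_0^2}H[f_M] = \frac{d\lambda_0}{d\mu_0} = -\frac{\begin{vmatrix}\mu_2 & \cdots & \mu_{M+1}\\ \vdots & & \vdots\\ \mu_{M+1} & \cdots & \mu_{2M}\end{vmatrix}}{|\Delta_{2M}|} = -\frac{1}{\mu_0 - \mu_0^{(M)}} < 0.$$

Thus $H[f_M]$ is a concave differentiable function of $\mu_0$. When $\mu_0 \to \mu_0^{(M)}$ then $H[f_M] \to -\infty$, whilst at $\mu_0$ it holds that $H[f_M] > H[f]$, $f_M(x)$ being the maximum entropy density once $(\mu_0, \ldots, \mu_M)$ are assigned. Besides, when $M \to \infty$ then $\mu_0^{(M)} \to \mu_0$. So the theorem is proved.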
Abstract

We outline an efficient method for the reconstruction of a probability density function from the knowledge of its infinite sequence of ordinary moments. The approximate density is obtained by the maximum entropy technique, under the constraint of some fractional moments. The latter are obtained explicitly in terms of the infinite sequence of given ordinary moments. It is proved that the approximate density converges in entropy to the underlying density, so that it is useful for calculating expected values.